80 research outputs found

    A computer vision system for the recognition of trees in aerial photographs

    Growing forest damage in Central Europe has created demand for an appropriate forest damage assessment tool. The Vision Expert System (VES) is presented, which is capable of finding trees in color infrared aerial photographs. The concept and architecture of VES are discussed briefly. The system is applied to a multisource test data set. Processing this multisource data set yields multiple interpretation results for one scene. Integrating these results will provide a better scene description by the vision system. This is achieved by an implementation of Steven's correlation algorithm.

    Detect to Track and Track to Detect

    Recent approaches for high accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs detection and tracking, solving the task in a simple and effective way. Our contributions are threefold: (i) we set up a ConvNet architecture for simultaneous detection and tracking, using a multi-task objective for frame-based object detection and across-frame track regression; (ii) we introduce correlation features that represent object co-occurrences across time to aid the ConvNet during tracking; and (iii) we link the frame level detections based on our across-frame tracklets to produce high accuracy detections at the video level. Our ConvNet architecture for spatiotemporal object detection is evaluated on the large-scale ImageNet VID dataset where it achieves state-of-the-art results. Our approach provides better single model performance than the winning method of the last ImageNet challenge while being conceptually much simpler. Finally, we show that by increasing the temporal stride we can dramatically increase the tracker speed. Comment: ICCV 2017. Code and models: https://github.com/feichtenhofer/Detect-Track Results: https://www.robots.ox.ac.uk/~vgg/research/detect-track
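The abstract above mentions correlation features across time without giving their form. As a rough illustration only, the sketch below computes a local cross-correlation between two per-frame feature maps: each position in the first frame is compared, by dot product, with nearby positions in the second frame. The function name `local_correlation`, the nested-list layout, and the `max_disp` parameter are assumptions made for this example, not taken from the paper or its released code.

```python
def local_correlation(feat_a, feat_b, max_disp=1):
    """Correlate two H x W x C feature maps over a small displacement window.

    feat_a, feat_b: nested lists indexed as [row][col][channel].
    Returns a dict mapping each offset (dy, dx) to an H x W map whose entry
    at (i, j) is the dot product of feat_a[i][j] and feat_b[i + dy][j + dx].
    Positions that fall outside the second map contribute 0.0.
    """
    h, w = len(feat_a), len(feat_a[0])
    out = {}
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            grid = [[0.0] * w for _ in range(h)]
            for i in range(h):
                for j in range(w):
                    y, x = i + dy, j + dx
                    if 0 <= y < h and 0 <= x < w:
                        grid[i][j] = sum(
                            a * b for a, b in zip(feat_a[i][j], feat_b[y][x])
                        )
            out[(dy, dx)] = grid
    return out


# Toy example: a single active feature moves one pixel to the right.
frame_t = [[[1.0], [0.0]], [[0.0], [0.0]]]
frame_next = [[[0.0], [1.0]], [[0.0], [0.0]]]
corr = local_correlation(frame_t, frame_next, max_disp=1)
```

In this toy case the response at position (0, 0) is non-zero only for the offset (0, 1), which is the kind of displacement cue a tracker could regress on.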

    Representations for Cognitive Vision : a Review of Appearance-Based, Spatio-Temporal, and Graph-Based Approaches

    The emerging discipline of cognitive vision requires a proper representation of visual information including spatial and temporal relationships, scenes, events, semantics and context. This review article summarizes existing representational schemes in computer vision which might be useful for cognitive vision, and discusses promising future research directions. The various approaches are categorized according to appearance-based, spatio-temporal, and graph-based representations for cognitive vision. While the representation of objects has been covered extensively in computer vision research, both from a reconstruction as well as from a recognition point of view, cognitive vision will also require new ideas on how to represent scenes. We introduce new concepts for scene representations and discuss how these might be efficiently implemented in future cognitive vision systems.

    Interpretation and Fusion - Recognition versus Reconstruction

    The purpose of this contribution is to reframe the general problem of image understanding in the light of information fusion, in order to significantly reduce the complexity of vision problems. We introduce the framework of a general vision system based on an 'active fusion' module, and discuss the requirements for such a system, as well as its feasibility. Dealing with visual information from diverse sources implies the necessity of matching and/or establishing spatial relations between these sources. This is illustrated by results of a 2D affine matching algorithm applied to medical images. Initial Statements S1: (terminology) The terminology in information fusion lacks standardization. Terms like fusion, integration, sensor fusion, information fusion, integration of visual modules, consensus vision etc. appear in the fusion literature. Therefore, a "Glossary of Computer Vision Terms in Connection to Information Fusion" (see chapter ??) is prepared and distributed to all partici..

    Consistent Visual Information Processing Applied to Object Recognition, Landmark Definition, and Real-Time Tracking. VMV'01

    Situations in which visual information arrives from multiple sources require the fusion of that information. This is a very common task in the processing of multisource / multitemporal datasets, in sensor fusion, and in all kinds of active vision systems. A general approach to this problem is presented which goes beyond previous information theoretic investigations. Starting from the paradigm of ‘Active Fusion’, where entropy is used as a measure to evaluate the expected gain in information from a potential data source, we develop the concept of data ‘consistency’. In multisource visual information processing, consistency can be expressed by vicinity in space, by similarity of visual landmarks or by higher level constraints like smoothness of motion trajectories, rigid body, or continuity constraints. Several sample applications are presented, including an active object recognition system, the definition of salient landmarks, and an optical tracking system. In summary, consistency evaluation is a powerful method to reduce complexity and to resolve otherwise ill-posed problems like ambiguity in computer vision.
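The entropy-based evaluation mentioned in the abstract above can be illustrated with a small sketch. It assumes a discrete set of hypotheses (for example object identities) with a prior belief, and a candidate data source described by a table `likelihood[h][o]` = P(observation o | hypothesis h). The expected gain is then the prior entropy minus the expected posterior entropy. The function names and this discrete Bayesian formulation are assumptions for illustration; the abstract does not specify the authors' exact procedure.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def posterior(prior, likelihood_col):
    """Bayes update of the prior given P(observation | hypothesis) per hypothesis."""
    joint = [pr * lk for pr, lk in zip(prior, likelihood_col)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_gain(prior, likelihood):
    """Expected entropy reduction (bits) from querying one data source.

    likelihood[h][o] is the assumed probability of observation o
    under hypothesis h.
    """
    gain = entropy(prior)
    for o in range(len(likelihood[0])):
        col = [row[o] for row in likelihood]
        p_o = sum(pr * lk for pr, lk in zip(prior, col))
        if p_o > 0:
            gain -= p_o * entropy(posterior(prior, col))
    return gain


prior = [0.5, 0.5]
informative = [[1.0, 0.0], [0.0, 1.0]]    # observation identifies the hypothesis
uninformative = [[0.5, 0.5], [0.5, 0.5]]  # observation is independent of it

gain_informative = expected_gain(prior, informative)
gain_uninformative = expected_gain(prior, uninformative)
```

Under these assumptions, a system would query the source with the larger expected gain first; here the informative source removes the full bit of uncertainty while the uninformative one removes none.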